[57] ABSTRACT

The illusion of distinct sound sources distributed 
throughout the three-dimensional space containing the 
listener is possible using only conventional stereo play- 
back equipment by processing monaural sound signals 
prior to playback on two spaced-apart transducers. A 
plurality of such processed signals corresponding to 
different sound source positions may be mixed using
conventional techniques without disturbing the positions
of the individual images. Although two loudspeakers are
required, the sound produced is not conventional stereo;
however, each channel of a left/right stereo signal can be
separately processed according to the invention and then
combined for playback. The sound
processing involves dividing each monaural or single 
channel signal into two signals and then adjusting the 
differential phase and amplitude of the two channel 
signals on a frequency dependent basis in accordance 
with an empirically derived transfer function that has a 
specific phase and amplitude adjustment for each prede- 
termined frequency interval over the audio spectrum. 
Each transfer function is empirically derived to relate to 
a different sound source location and by providing a 
number of different transfer functions and selecting 
them accordingly the sound source can be made to 
appear to move. 

10 Claims, 16 Drawing Sheets 
SOUND IMAGING METHOD AND APPARATUS

This is a continuation of application Ser. No. 07/398,988, filed
Aug. 28, 1989, now abandoned.

BACKGROUND OF THE INVENTION

Field of the Invention

This invention relates generally to a method and apparatus for
processing an audio signal and, more particularly, to processing an
audio signal so that the resultant sounds appear to the listener to
emanate from a location other than the actual location of the
loudspeakers.

Human listeners are readily able to estimate the direction and range
of a sound source. When multiple sound sources are distributed in
space around the listener, the position of each may be perceived
independently and simultaneously. Despite substantial and continuing
research over many years, no satisfactory theory has yet been
developed to account for all of the perceptual abilities of the
average listener.

A process that measures the pressure or velocity of a sound wave at a
single point, and reproduces that sound effectively at a single point,
will preserve the intelligibility of speech and much of the identity
of music. Nevertheless, such a system removes all of the information
needed to locate the sound in space. Thus, an orchestra, reproduced by
such a system, is perceived as if all instruments were playing at the
single point of reproduction.

Efforts were therefore directed to preserving the directional cues
contained inherently in the sounds during transmission or recording
and reproduction. In U.S. Pat. No. 2,093,540 issued to Alan D.
Blumlein in September, 1937, substantial detail for such a two-channel
system is given. The artificial emphasis of the difference between the
stereo channels as a means of broadening the stereo image, which is
the basis of many present stereo sound enhancement techniques, is
described in detail.

Some known stereo enhancement systems rely on cross-coupling the
stereo channels in one way or another, to emphasize the existing cues
to spatial location contained in a stereo recording. Cross-coupling
and its counterpart crosstalk cancellation both rely on the geometry
of the loudspeakers and listening area and so must be individually
adjusted for each case.

It is clear that attempted refinements of the stereo system have not
produced great improvement in the systems now in widespread use for
entertainment. Real listeners like to sit at ease, move or turn their
heads, and place their loudspeakers to suit the convenience of room
layout and to fit in with other furniture.

OBJECT AND SUMMARY OF THE INVENTION

Thus, it is an object of the present invention to provide a method and
apparatus for processing an audio signal so that when it is reproduced
over two audio transducers the apparent location of the sound source
can be suitably controlled, so that it seems to the listener that the
location of the sound source is separated from the location of the
transducers or speakers.

The present invention is based on the discovery that audio
reproduction of a monaural signal using two independent channels and
two loudspeakers can produce highly localized images of great clarity
in different positions. Observation of this phenomenon by the
inventors, under specialized conditions in a recording studio, led to
systematic investigations of the conditions required to produce this
audio illusion. Some years of work have produced a substantial
understanding of the effect, and the ability to reproduce it
consistently and at will.

According to the present invention, an auditory illusion is produced
that is characterized by placing a sound source anywhere in the
three-dimensional space surrounding the listener, without constraints
imposed by loudspeaker positions. Multiple images, of independent
sources and in independent positions, without known limit to their
number, may be reproduced simultaneously using the same two channels.
Reproduction requires no more than two independent channels and two
loudspeakers, and separation distance or rotation of the loudspeakers
may be varied within broad limits without destroying the illusion.
Rotation of the listener's head in any plane, for example to "look at"
the image, does not disturb the image.

The processing of audio signals in accordance with the present
invention is characterized by processing a single channel audio signal
to produce a two-channel signal wherein the differential phase and
amplitude between the two signals is adjusted on a frequency dependent
basis over the entire audio spectrum. This processing is carried out
by dividing the monaural input signal into two signals and then
passing one or both of such signals through a transfer function whose
amplitude and phase are, in general, non-uniform functions of
frequency. The transfer function may involve signal inversion and
frequency-dependent delay. Furthermore, to the best knowledge of the
inventors the transfer functions used in the inventive processing are
not derivable from any presently known theory. They must be
characterized by empirical means. Each processing transfer function
places an image in a single position which is determined by the
characteristics of the transfer function. Thus, sound source position
is uniquely determined by the transmission function.

For a given position there may exist a number of different transfer
functions, each of which will suffice to place the image generally at
the specified position.

If a moving image is required, it may be produced by smoothly changing
from one transfer function to another in succession. Thus, a suitably
flexible implementation of the process need not be confined to the
production of static images.

Audio signals processed according to the present invention may be
reproduced directly after processing, or be recorded by conventional
stereo recording techniques on various media such as optical disc,
magnetic tape, phono record or optical sound track, or transmitted by
any conventional stereo transmission technique such as radio or cable,
without any adverse effects on the auditory image provided by the
invention.

The imaging process of the present invention may also be applied
recursively. For example, if each channel of a conventional stereo
signal is treated as a monophonic signal, and the channels are imaged
to two different positions in the listener's space, a complete
conventional stereo image along the line joining the positions of the
images of the channels will be perceived. In addition, at the time the
stereo record or disc is being recorded on multitrack tape, having for
example twenty-four channels, each channel can be fed through a
transfer function processor so that the recording engineer can locate
the various instruments and voices at will to create a specialized
sound stage. The result of this is still two-
channel audio signals that can be played back on conventional
reproducing equipment, but that will contain the inventive auditory
imaging capability.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a plan view representation of a listening geometry for
defining parameters of image location;

FIG. 2 is a side view corresponding to FIG. 1;

FIG. 3 is a plan view representation of a listening geometry for
defining parameters of listener location;

FIG. 4 is an elevational view corresponding to FIG. 3;

FIGS. 5a-5k are plan views of respective listening situations with
corresponding variations in loudspeaker placement and FIG. 5m is a
table of critical dimensions for three listening rooms;

FIG. 6 is a plan view of an image transfer experiment carried out in
two isolated rooms;

FIG. 7 is a process block diagram relating the present invention to
prior art practice;

FIG. 8 is a schematic in block diagram form of a sound imaging system
according to an embodiment of the present invention;

FIG. 9 is a pictorial representation of an operator workstation
according to an embodiment of the present invention;

FIG. 10 depicts a computer-graphic perspective display used in
controlling the present invention;

FIG. 11 depicts a computer-graphic display of three orthogonal views
used in controlling the present invention;

FIG. 12 is a schematic representation of the formation of virtual
sound sources by the present invention, showing a plan view of three
isolated rooms;

FIG. 13 is a schematic in block diagram form of equipment for
demonstrating the present invention;

FIG. 14 is a waveform diagram of a test signal plotted as voltage
against time;

FIG. 15 tabulates data representing a transfer function according to
an embodiment of the present invention;

FIG. 16 is a schematic in block diagram form of a sound image location
system according to an embodiment of the present invention;

FIGS. 17A and 17B are graphical representations of typical transfer
functions employed in the sound processors of FIG. 16;

FIGS. 18A-18C are schematic block diagrams of a circuit embodying the
present invention; and

FIG. 19 is a schematic block diagram of additional circuitry which
further embodies the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

In order to define terms that will allow an unambiguous description of
the auditory imaging process according to the present invention,
FIGS. 1-4 show some dimensions and angles involved.

FIG. 1 is a plan view of a stereo listening situation, showing left
and right loudspeakers 101 and 102, respectively, a listener 103, and
a sound image position 104 that is apparent to listener 103. For
purposes of definition only, the listener is shown situated on a line
105 perpendicular to a line 106 joining loudspeakers 101 and 102, and
erected at the midpoint of line 106. This listener position will be
referred to as the reference listener position, but with this
invention the listener is not confined to this position. From the
reference listener position an image azimuth angle (a) is measured
counterclockwise from line 105 to a line 107 between listener 103 and
image position 104. Similarly, the image slant range (r) is defined as
the distance from listener 103 to image position 104. This range is
the true range measured in three-dimensional space, not the projected
range as measured on the plan or other orthogonal view.

In the present invention the possibility arises of images
substantially out of the plane of the speakers. Accordingly, in FIG. 2
an altitude angle (b) for the image is defined. A listener position
201 corresponds with position 103 and an image position 202
corresponds with image position 104 in FIG. 1. Image altitude angle
(b) is measured upwardly from a horizontal line 203 through the head
of listener 103 to a line 204 joining the listener's head to image
position 202. It should be noted that loudspeakers 101, 102 do not
necessarily lie on line 203.

Having defined the image positional parameters with respect to a
reference listening configuration, we proceed to define parameters for
possible variations in the listening configuration. Referring to
FIG. 3, loudspeakers 301 and 302, and lines 304 and 305 correspond
respectively to items 101, 102, 106, and 105 in FIG. 1. A loudspeaker
spacing distance (s) is measured along line 304, and a listener
distance (d) is measured along line 305. In the case that a listener
is arranged parallel to line 304 along line 306 to position 307, we
define a lateral displacement (e) measured along line 306. For each
loudspeaker 301 and 302 we define respective azimuth angles (p) and
(q) as measured counterclockwise from a line through loudspeakers 301,
302 and perpendicular to a line joining them, in a direction toward
the listener. Similarly for the listener we define an azimuth angle
(m) counterclockwise from line 305 in the direction the listener is
facing.

In FIG. 4, a loudspeaker height (h) is measured upward from the
horizontal line 401 through the head of the listener 303 to the
vertical centerline of loudspeaker 302.

The parameters as defined allow more than one de- 
scription of a given geometry. For example, an image 
position may be described as (180,0,x) or (0,180,x) with 
complete equivalence. 
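
The equivalence of the two descriptions can be checked numerically. The following is a minimal sketch, assuming a Cartesian frame that the text does not define (x forward along line 105, y to the listener's left so that azimuth increases counterclockwise in plan view, z upward); the function names and the value chosen for x are illustrative only.

```python
import math

def image_position_to_xyz(a_deg, b_deg, r):
    """Convert (azimuth a, altitude b, slant range r) to Cartesian x, y, z.

    Assumed frame (not from the text): x points forward along the
    reference line, y to the listener's left, z upward.
    """
    a = math.radians(a_deg)
    b = math.radians(b_deg)
    horizontal = r * math.cos(b)          # projection onto the plan view
    return (horizontal * math.cos(a),
            horizontal * math.sin(a),
            r * math.sin(b))

def same_point(p, q, tol=1e-9):
    """True when two Cartesian points agree within a small tolerance."""
    return all(abs(u - v) <= tol for u, v in zip(p, q))

x = 2.0  # arbitrary range, standing in for the "x" of the text
behind_by_azimuth = image_position_to_xyz(180, 0, x)
behind_by_altitude = image_position_to_xyz(0, 180, x)
print(same_point(behind_by_azimuth, behind_by_altitude))  # prints True
```

Both triples land directly behind the listener at range x, which is why either may be used.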

In conventional stereophonic reproduction the image is confined to lie
along line 106 in FIG. 1, whereas the image produced by the present
invention may be placed freely in space: azimuth angle (a) may range
from 0-360 degrees, and range (r) is not restricted to distances
commensurate with (s) or (d). An image may be formed very close to the
listener, at a small fraction of (d), or remote at a distance several
times (d), and may simultaneously be at any azimuth angle (a) without
reference to the azimuth angle subtended by the loudspeakers. In
addition, the present invention is capable of image placement at any
altitude angle (b). Listener distance (d) may vary from 0.5 m to 30 m
or beyond, with the image apparently static in space during the
variation.

Good image formation has been achieved with loudspeaker spacings from
0.2 m to 8 m, using the same signals to drive the loudspeakers from
all spacings. Azimuth angles at the loudspeakers (p) and (q) may be
varied independently over a broad range with no effect on the image.

It is characteristic of this invention that moderate changes in
loudspeaker height (h) do not affect the image altitude angle (b)
perceived by the listener. This is true for both positive and negative
values of (h), that
is to say loudspeaker placement above or below the listener's head
height.

Since the image formed is extremely realistic, it is natural for the
listener to turn to "look at", that is to face directly toward, the
image. The image remains stable as this is done; listener azimuth
angle (m) has no perceptible effect on the spatial position of the
image, for at least a range of angles (m) from +120 to -120 degrees.
So strong is the impression of a localized sound source that listeners
have no difficulty in "looking at" or pointing to the image; a group
of listeners will report the same image position.

FIGS. 5a-5k show a set of ten listening geometries in which image
stability has been tested. In FIG. 5a, a plan view of a listening
geometry is shown. Left and right loudspeakers 501 and 502
respectively reproduce sound for listener 503, producing a sound image
504. Sub-FIGS. 5a through 5k show variations in loudspeaker
orientation, and are generally similar to sub-FIG. 5a.

All ten geometries were tested in three different listening rooms with
different values of loudspeaker spacing (s) and listener distance (d),
as tabulated in FIG. 5m. Room 1 was a small studio control area
containing considerable amounts of equipment, room 2 was a large
recording studio almost completely empty, and room 3 was a small
experimental room with sound absorbing material on three walls.

For each test the listener was asked to give the perceived image
position for two conditions: listener head angle (m) zero, and head
turned to face the apparent image position. Each test was repeated
with three different listeners. Thus the image stability was tested in
a total of 180 configurations (ten geometries, three rooms, two head
conditions, three listeners). Each of these configurations used the
same input signals to the loudspeakers. In every case the image
azimuth angle (a) was perceived as [...]

In FIG. 6 an image transfer experiment is shown in which a sound image
601 is formed by signals processed according to the present invention,
driving loudspeakers 602 and 603 in a first room 604. A dummy head
605, such as shown for instance in German Patent 1 927 401, carries
left and right microphones 606 and 607 in its model ears. Electrical
signals on lines 608 and 609 from microphones 606, 607 are separately
amplified by amplifiers 610 and 611, which drive left and right
loudspeakers 612 and 613, respectively, in a second room 614. A
listener 615 situated in this second room, which is acoustically
isolated from the first room, will perceive a sharp secondary image
616 corresponding to the image 601 in the first room.

An example of the relationship of the inventive sound processor to
known systems is shown in FIG. 7, in which one or more multi-track
signal sources 701, which may be magnetic tape replay machines, feed a
plurality of monophonic signals 702 derived from a plurality of
sources to a studio mixing console 703. The console may be used to
modify the signals, for instance by changing levels and balancing
frequency content, in any desired ways.

A plurality of modified monophonic signals 704 produced by console 703
are connected to the inputs of an image processing system 705
according to the present invention. Within this system each input
channel is assigned to an image position, and transfer function
processing is applied to produce two-channel signals from each single
input signal 704. All of the two-channel signals are mixed to produce
a final pair of signals 706, 707, which may then be returned to a
mixing console 708. It should be understood that the two-channel
signals produced by this invention are not really left and right
stereo signals; however, such connotation provides an easy way of
referring to these signals. Thus, when all of the two-channel signals
are mixed, all of the left signals are combined into one signal and
all of the right signals are combined into one signal. In practice,
console 703 and console 708 may be separate sections of the same
console. Using console facilities, the processed signals may be
applied to drive loudspeakers 709, 710 for monitoring purposes. After
any required modification and level setting, master stereo signals 711
and 712 are led to master stereo recorder 713, which may be a
two-channel magnetic tape recorder. Items subsequent to item 705 are
well known in the prior art.

Sound image processing system 705 is shown in more detail in FIG. 8,
in which input signals 801 correspond to signals 704 and output
signals 807, 808 correspond respectively to signals 711, 712 of
FIG. 7. Each monaural input signal 801 is fed to an individual signal
processor 802.

These processors 802 operate independently, with no intercoupling of
audio signals. Each signal processor operates to produce the
two-channel signals having differential phase and amplitude adjusted
on a frequency dependent basis. These transfer functions will be
explained in detail below. The transfer functions, which may be
described in the time domain as real impulse responses or equivalently
in the frequency domain as complex frequency responses, characterize
only the desired image position to which the input signal is to be
projected.

One or more processed signal pairs 803 produced by the signal
processors are applied to the inputs of stereo mixer 804. Some or all
of them may also be applied to the inputs of a storage system 805.
This system is capable of storing complete processed stereo audio
signals and of replaying them simultaneously to appear at outputs 806.
Typically this storage system may have different numbers of input
channel pairs and output channel pairs. The outputs 806 are applied to
further inputs of stereo mixer 804. Stereo mixer 804 sums all left
inputs to produce left output 807 and all right inputs to produce
right output 808, possibly modifying the amplitude of each input
before summing. No interaction or coupling of left and right channels
takes place in the mixer.

A human operator 809 may control operation of the system via human
interface means 810 to specify the desired image position to be
assigned to each input channel.

It may be particularly advantageous to implement signal processors 802
digitally, so that no limitation is placed on the position,
trajectory, or speed of motion of an image. These digital sound
processors that provide the necessary differential adjustment of phase
and amplitude on a frequency dependent basis will be explained in more
detail below. In such a digital implementation it may not always be
economic to provide for signal processing to occur in real time,
though such operation is entirely feasible. If real-time signal
processing is not provided, outputs 803 would be connected to storage
system 805, which would be capable of slow recording and real-time
replay. Conversely, if an adequate number of real-time signal
processors 802 are provided, storage system 805 may be omitted.
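
The summing behaviour described for stereo mixer 804 can be sketched as follows. This is only an illustration under stated assumptions: signals are plain Python lists of samples, each processed pair receives a single gain applied equally to both of its channels (the text says only that amplitudes may be modified before summing), and the function name `mix_pairs` is hypothetical.

```python
def mix_pairs(pairs, gains=None):
    """Sum processed (left, right) sample lists channel by channel.

    Left inputs contribute only to the left output and right inputs only
    to the right output, so no coupling between channels is introduced.
    """
    if not pairs:
        return [], []
    if gains is None:
        gains = [1.0] * len(pairs)
    n = max(max(len(left), len(right)) for left, right in pairs)
    out_left = [0.0] * n
    out_right = [0.0] * n
    for (left, right), g in zip(pairs, gains):
        for i, s in enumerate(left):
            out_left[i] += g * s
        for i, s in enumerate(right):
            out_right[i] += g * s
    return out_left, out_right

# Two processed pairs of unequal length; shorter inputs are treated as
# silent after their last sample.
pair_a = ([1.0, 2.0], [0.5, 0.5])
pair_b = ([1.0], [1.0, 1.0, 1.0])
print(mix_pairs([pair_a, pair_b], gains=[1.0, 2.0]))
# prints ([3.0, 2.0, 0.0], [2.5, 2.5, 2.0])
```

Applying one gain to both halves of a pair leaves the amplitude ratio within that pair untouched, which is the quantity that carries the positional information.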
In FIG. 9, operator 901 controls mixing console 902 equipped with left
and right stereo monitor loudspeakers 903, 904. Although stability of
the final processed image is good to a loudspeaker spacing (s) as low
as 0.2 m, it is preferable for the mixing operator to be provided with
loudspeakers placed at least 0.5 m apart. With such spacing, accurate
image placement is more readily achieved. A computer graphic display
means 905, a multi-axis control 906, and a keyboard 907 are provided,
along with suitable computing and storage facilities to support them.

Computer graphic display means 905 may provide a graphic
representation of the position or trajectory of the image in space as
shown, for example, in FIGS. 10 and 11. FIG. 10 shows a display 1001
of a listening situation in which a typical listener 1002 and an image
trajectory 1003 are presented, along with a representation of a motion
picture screen 1004 and perspective space cues 1005, 1006.

At the bottom of the display is a menu 1007 of items relating to the
particular section of sound track being operated upon, including
recording, time synchronization, and editing information. Menu items
may be selected by keyboard 907, or by moving cursor 1008 to the item,
using multi-axis control 906. The selected item can be modified using
keyboard 907, or toggled using a button on multi-axis control 906,
invoking appropriate system action. In particular, a menu item 1009
allows an operator to link the multi-axis control 906 by software to
control the viewpoint from which the perspective view is projected, or
to control the position/trajectory of the current sound image. Another
menu item 1010 allows selection of an alternate display illustrated in
FIG. 11.

In the display of FIG. 11 the virtually full-screen perspective
presentation 1001 shown in FIG. 10 is replaced by a set of three
orthogonal views of the same scene: a top view 1101, a front view
1102, and a side view 1103. To aid in interpretation the remaining
screen quadrant is occupied by a reduced and less detailed version
1104 of the perspective view 1001. Again a menu 1105, substantially
similar to that shown at 1007 and with similar functions, occupies the
bottom of the screen. One particular menu item 1106 allows toggling
back to the display of FIG. 10.

In FIG. 12, sound sources 1201, 1202, and 1203 in a first room 1204
are detected by two microphones 1205 and 1206 that generate right and
left stereo signals, respectively, that are recorded using
conventional stereo recording equipment 1207. If replayed on
conventional stereo replay equipment 1208, driving right and left
loudspeakers 1209, 1210, respectively, with the signals originating
from microphones 1205, 1206, conventional stereo images 1211, 1212,
1213 corresponding respectively to sources 1201, 1202, 1203 will be
perceived by a listener 1214 in a second room 1215. These images will
be at positions that are projections onto the line joining
loudspeakers 1209, 1210 of the lateral positions of the sources
relative to microphones 1205, 1206.

If the two pairs of stereo signals are processed and combined as
detailed above using sound processor 1216, and reproduced by
conventional stereo playback equipment 1217 on right and left
loudspeakers 1218, 1219 in a third room 1220, crisp spatially
localized images of the sound sources are apparent to listener 1226 at
positions unrelated to the actual positions of loudspeakers 1218,
1219. Let us suppose that the processing was such as to form an image
of the original right channel signal at position 1224, and an image of
the original left channel signal at 1225. Each of these images behaves
as if it were truly a loudspeaker; we may think of the images as
"virtual loudspeakers".

A transfer function in which both differential ampli- 
tude and phase of a two-channel signal are adjusted on 
a frequency dependent basis across the entire audio 
band is required to project an image of a monaural audio 
signal to a given position. For general applications to 
specify each such response, the amplitude and phase 
differential at intervals not exceeding 40 Hz must be 
specified independently for each of the two channels 
over the entire audio spectrum, for best image stability 
and coherence. For applications not requiring high 
quality and sound image placement the frequency intervals
may be expanded. Hence specification of such a response
requires about 1000 real numbers (or equivalently, 500
complex ones). Differences for human perception of auditory
spatial location are somewhat indefinite, being based on
subjective measurement, but in a true three-dimensional
space more than 1000 distinct positions are resolvable by
an average listener. Exhaustive characterization of all
responses for all possible positions therefore constitutes
a vast body of data, comprising in all more than one
million real numbers, the
collection of which is in progress. 
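
The figures quoted in this paragraph follow from simple counting, as the sketch below shows. It assumes an audio bandwidth of 20 kHz, which the text does not state (it says only "the entire audio spectrum"); the 40 Hz interval and the 1000 positions are taken from the paragraph, and the names are illustrative.

```python
import math

AUDIO_BANDWIDTH_HZ = 20_000   # assumption: not given in the text
INTERVAL_HZ = 40              # from the text: intervals not exceeding 40 Hz
POSITIONS = 1000              # from the text: more than 1000 resolvable positions

def numbers_per_response(bandwidth_hz=AUDIO_BANDWIDTH_HZ, interval_hz=INTERVAL_HZ):
    """Return (complex values, real values) needed to specify one response."""
    points = math.ceil(bandwidth_hz / interval_hz)   # one point per interval
    return points, 2 * points                        # amplitude and phase per point

complex_count, real_count = numbers_per_response()
total = real_count * POSITIONS
print(complex_count, real_count, total)   # prints 500 1000 1000000
```

With a wider or narrower assumed bandwidth the totals scale in proportion, so the "about 1000" and "more than one million" of the text should be read as orders of magnitude.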

It should be noted that the transfer function in the 
sound processor according to this invention, which 
provides the differential adjustment between the two 
channels, is built up piece-by-piece by trial and error
testing over the audio spectrum for each 40 Hz interval. 
Moreover, as will be explained below, each transfer 
function in the sound processor locates the sound rela- 
tive to two spaced-apart transducers at only one loca- 
tion, that is, one azimuth, height, and depth. 
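
One way to picture such a piece-by-piece transfer function is as a table with one entry per 40 Hz interval, tied to a single image position. The sketch below is a hypothetical data structure, not the stored format used by the inventors: the class name, the choice of an amplitude ratio plus a differential delay per interval, and the numeric values are all assumptions made for illustration.

```python
INTERVAL_HZ = 40.0  # from the text: one adjustment per 40 Hz interval

def band_index(freq_hz):
    """Index of the 40 Hz interval containing freq_hz."""
    return int(freq_hz // INTERVAL_HZ)

class PositionTransferFunction:
    """Differential adjustments for ONE image position, filled in band by band."""

    def __init__(self, azimuth_deg, altitude_deg, range_m):
        self.position = (azimuth_deg, altitude_deg, range_m)
        self.bands = {}  # band index -> (amplitude ratio, differential delay in s)

    def set_band(self, freq_hz, amplitude_ratio, delay_s):
        # Record the setting a listener accepted for this interval.
        self.bands[band_index(freq_hz)] = (amplitude_ratio, delay_s)

    def lookup(self, freq_hz):
        # Returns None for intervals that have not been characterized yet.
        return self.bands.get(band_index(freq_hz))

tf = PositionTransferFunction(azimuth_deg=60, altitude_deg=0, range_m=2.0)
tf.set_band(1000.0, amplitude_ratio=0.7, delay_s=0.0002)   # made-up values
print(tf.lookup(1010.0))   # same interval as 1000 Hz: prints (0.7, 0.0002)
print(tf.lookup(1040.0))   # next interval, not yet set: prints None
```

A different image position would need its own table, in keeping with the statement that each transfer function places the sound at only one location.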

In practice, however, we need not represent all trans- 
fer function responses explicitly, as mirror-image sym- 
metry generally exists between the right and left chan- 
nels. If the responses modifying the channels are inter- 
changed, the image azimuth angle (a) is inverted, whilst 
the altitude (b) and range (r) remain unchanged. 
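
The saving from mirror-image symmetry can be sketched as a swap of the two channel responses. The table layout (frequency mapped to a left and a right adjustment) and the reading of "inverted" as azimuth a becoming -a, expressed in the 0-360 degree range, are assumptions for illustration; the numeric values are made up.

```python
def mirror_response(response):
    """Swap the left and right channel adjustments of a per-band table.

    `response` maps a frequency in Hz to
    ((gain_left, delay_left), (gain_right, delay_right)).
    """
    return {freq: (right, left) for freq, (left, right) in response.items()}

def mirror_position(azimuth_deg, altitude_deg, range_m):
    """Invert the azimuth; altitude and range are unchanged."""
    return ((-azimuth_deg) % 360, altitude_deg, range_m)

stored = {1000: ((1.0, 0.0), (0.7, 0.0002))}   # made-up values
print(mirror_response(stored))       # prints {1000: ((0.7, 0.0002), (1.0, 0.0))}
print(mirror_position(60, 10, 2.0))  # prints (300, 10, 2.0)
```

Under this reading only the positions on one side of the listener need to be stored explicitly.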

It is possible to demonstrate the inventive process and 
the auditory illusion using conventional equipment and 
by using simplified signals. If a burst of a sine wave at a 
known frequency is gated smoothly on and off at rela- 
tively long intervals, a very narrow band of the fre- 
quency domain is occupied by the resulting signal. Ef- 
fectively, this signal will sample the required response 
at a single frequency. Hence the required responses, 
that is, the transfer functions, reduce to simple control 
of differential amplitude and phase (or delay) between 
the left and right channels on a frequency dependent 
basis. Thus, it will be appreciated that the transfer func- 
tion for a specific sound placement can be built up em- 
pirically by making differential phase and amplitude 
adjustments for each selected frequency interval over 
the audio spectrum. By Fourier's theorem any signal 
may be represented as the sum of a series of sine waves, 
so the signal used is completely general. 
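
A narrowband test signal of this kind, with a differential delay, attenuation and optional inversion applied to one channel, can be sketched in a few lines. The envelope timings (20 ms linear ramps around a 45 ms dwell) are those given in the description of FIG. 14; the 48 kHz sample rate, the 1 kHz test frequency and the particular delay and gain values are assumptions chosen only to make the example run, and are not data from FIG. 15.

```python
import math

SAMPLE_RATE = 48_000  # assumption; the text gives no sample rate

def burst_envelope(t, ramp=0.020, dwell=0.045):
    """Linear ramp up, constant dwell, linear ramp down (times in seconds)."""
    if t < 0.0:
        return 0.0
    if t < ramp:
        return t / ramp
    if t < ramp + dwell:
        return 1.0
    if t < 2 * ramp + dwell:
        return (2 * ramp + dwell - t) / ramp
    return 0.0

def gated_burst(freq_hz, delay_s=0.0, gain=1.0, invert=False, duration=0.1):
    """One channel of the test signal, delayed and scaled as a whole."""
    sign = -1.0 if invert else 1.0
    samples = []
    for n in range(int(duration * SAMPLE_RATE)):
        t = n / SAMPLE_RATE - delay_s       # the delay shifts the whole burst
        tone = math.sin(2 * math.pi * freq_hz * t)
        samples.append(sign * gain * burst_envelope(t) * tone)
    return samples

# Left channel unmodified; right channel delayed, attenuated and inverted.
left = gated_burst(1000.0)
right = gated_burst(1000.0, delay_s=0.0002, gain=0.7, invert=True)
```

Repeating this for one test frequency after another, and noting the settings at which the image sits in the wanted place, is the empirical build-up the paragraph describes.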

An example of a system for demonstrating the present
invention is shown in FIG. 13, in which an audio
synthesizer 1302, a Hewlett-Packard Multifunction 
Synthesizer model 8904A, is controlled by a computer 
1301, Hewlett-Packard model 330M, to generate a mon- 
aural audio signal that is fed to the inputs 1303, 1304 of 
two channels of an audio delay line 1305, Eventide 
Precision Delay model PD860. From delay line 1305 
the right channel signal passes to a switchable inverter 



5,105,462 


1306 and left and right signals then pass through respective variable
attenuators 1307, 1308 and hence to two power amplifiers 1309, 1310
driving left and right loudspeakers 1311, 1312, respectively.

Synthesizer 1302 produces smoothly gated sine wave bursts of any
desired test frequency 1401, using an envelope as shown in FIG. 14.
The sine wave is gated on using a first linear ramp 1402 of 20 ms
duration, dwells at constant amplitude 1403 for 45 ms, and is then
gated off using a second linear ramp 1404 of 20 ms duration. Bursts
are repeated at intervals 1405 of about 1-5 seconds.

In addition, using the system of FIG. 13 and the waveform of FIG. 14,
the present invention can build up a transfer function over the audio
spectrum by adjusting the time delay in delay line 1305 and the
amplitude by attenuators 1307, 1308. A listener would make the
adjustment, listen to the sound placement and determine if it was in
the right location. If so, the next frequency interval would be
examined. If not, then further adjustments are made and the listening
process repeated. In this way the transfer function over the audio
spectrum can be built up.

FIG. 15 is a table of practical data to be used to form a transfer
function suitable to allow reproduction of auditory images well off
the direction of the loudspeakers for several sine wave frequencies.
This table might be developed just as explained above, by trial and
error listening. All of these images were found to be stable and
repeatable in all three listening rooms detailed in FIG. 5m, for a
broad range of listener head attitudes including directly facing the
image, and for a variety of listeners.

We may generalize the placement of narrowband signals, detailed above,
in such a manner as to permit broadband signals, representing
complicated sources such as speech and music, to be imaged. If the
differential amplitudes and phase shifts for the two channels that are
derived from a single input signal are specified for all frequencies
through the audio band, the complete transfer function is specified.
In practice, we need only explicitly specify the differential
amplitudes and delays for a number of frequencies in the band of
interest. Amplitudes and delays at any intermediate frequency, between
those specified, may then be found by interpolation. If the
frequencies at which the response is speci-

The system of FIG. 16 can be simplified, as shown from the following
analysis. Firstly, only the difference or differential between the
delays of the two channels is of interest. Suppose that the left and
right channel delays are t(l) and t(r) respectively. New delays t'(l)
and t'(r) are defined by adding any fixed delay t(a), such that:

    t'(l) = t(l) + t(a)    (1)

    t'(r) = t(r) + t(a)    (2)

The result is that the entire effect is heard a time t(a) later, or
earlier where t(a) is negative. This general expression holds in the
special case where t(a) = -t(r). Substituting:

    t'(l) = t(l) - t(r)    (3)

    t'(r) = t(r) - t(r) = 0    (4)

By this transformation we can always reduce the delay in one channel
to zero. In a practical implementation we must be careful to subtract
out the smaller delay, so that the need for a negative delay never
arises. It may be preferred to avoid this problem by leaving a fixed
residual delay in one channel, and changing the delay in the other. If
the fixed residual delay is of sufficient magnitude, the variable
delay need not be negative.

Secondly, we need not control channel amplitudes independently. It is
a common operation in audio engineering to change the amplitudes of
signals either by amplification or attenuation. So long as both stereo
channels are changed by the same ratio, there is no change in the
positional information carried. It is the ratio or differential of
amplitudes that is important and must be preserved. So long as this
differential is preserved, all of the effects and illusions in this
description are entirely independent of the overall sound level of
reproduction. Accordingly, by an operation similar to that detailed
above for timing or phase control, we may place all of the amplitude
control in one channel, leaving the other at a fixed amplitude. Again,
it may be convenient to apply a fixed residual attenuation to one
channel, so that all required ratios are attainable by attenuation of
the other. Full control is then available using a variable attenuator
in one channel only.

fied are not too widely spaced, and taking into account We may thus specify all the required information by 

the smoothness or rate of change of the true response specifying the differential attentuation and delay as 

represented, the method of interpolation is not too criti- 50 functions of frequency for a single channel. A fixed, 

ca I frequency-independent attentuation and delay may be 

In the table of FIG. 15, the amplitudes and delays are specified for the second channel; if these are left unspec- 
applied to the signal in each channel and this is shown ified, we assume unity gain and zero delay, 
generally in FIG. 16 in which a separate sound proces- Thus, for any one sound image position, and there- 
sor 1500, 1501 is provided. The single channel audio 55 fore any one left/right transfer function, the differential 
signal is fed in at 1502 and fed to both sound processors phase and amplitude adjusting (filtering) may be orga- 
1500, 1501 where the amplitude and phase are adjusted nized all in one channel or the other or any combination 
on a frequency dependent basis so that the differential at in between. One of sound processors 1500, 1501 can be 
the left and right channel outputs 1503, 1504, respec- simplified to no more than a variable impedance or to 
tively, is the correct amount that was empirically deter- 60 just a straight wire. It can not be an open circuit As- 
mined, as explained above. The control parameters fed suming that the phase and amplitude adjusting is per- 
in on line 1505 change the differential phase and ampli- formed in only one channel to provide the necessary 
tude adjustment so that the sound image can be at a differential between the two channels the transfer func- 
different, desired location. For example, in a digital tions would then be represented as in FIGS. 17A and 
implementation the sound processors could be finite 65 17B. 

impulse response (FIR) filters whose coefficients are FIGS. 17A represents a typical transfer function for 

varied by the control parameter signal to provide differ- the differential phase of the two channels, wherein the 

ent effective transfer functions. left channel is unaltered and the right channel under- 
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goes phase adjustment on a frequency dependent basis achieved. A fully programmable digital filter is appro- 
over the audio spectrum. Similarly, FIG. 17B repre- priate to meet this requirement, 
sents generally a typical transfer function for the differ- Such a digital filter may operate in the frequency 
ential amplitude of the two channels, wherein the ampli- domain, in which case, the signal is first Fourier trans- 
tude of the left channel is unaltered and the right chan- 5 formed to move it from a time domain representation to 
nel undergoes attentuation on a frequency dependent a frequency domain one. The filter amplitude and phase 
basis over the audio spectrum. response, determined by one of the above methods, is 

It is appreciated that the sound positioners: 1500, then applied to the frequency domain representation of 

1501 of FIG. 16, for example, can be analog or digital the signal by complex multiplication. Finally, an inverse 

and may include some or all of the following circuit 10 Fourier transform is applied, bnnging the signal back to 

elements: filters, delays, inventors, summers, amplifiers, *e time domain for digital to analog conversion, 

and phase shifters. These functional circuit elements can Alternatively, we may specify the response directly 

be organized in any fashion that results in the transfer " the time domain as a real impulse response. This 

function response is mathematically equivalent to the frequency 

Several equivalent representations of this information 15 amplitude and phase response, and may be ob- 

are possible, and are commonly used in related arts. ^^it^f^hc^im^mymcFo^m 

For example, the delay maybe specified as a phase W * ■»* ^^!^^^^^ 

, . • f • *t. the time domain by convolving it with the time domain 

change at any grven frequency, using the equivalences: representation of ^ signaJ * may ^ demonstrated 

Phase (degrees)-^ x&hy time)xfrequency 20 that the operation of convolution in the time domain is 

mathematically identical with the operation of multiph- 
Phase (radians)=2x{defoy time)x frequency cation in the frequency domain, so that the direct con- 
volution is entirely equivalent to the frequency domain 
Caution in applying this equivalence is required, be- operation detailed in the preceding paragraph, 
cause it is not sufficient to specify the principal value of 25 Since all digital computations are discrete rather than 
phase; the full phase is required if the above equiva- continuous, a discrete notation is preferred to a continu- 
ances are to hold. ous one. It is convenient to specify the response directly 
A convenient representation commonly used in elec- in terms of the coefficients which will be applied in a 
tronic engineering is the complex s-plane representa- recursive direct convolution digital filter, and this is 
tion. All filter characteristics realizable using real ana- 30 readily done using a z-plane notation that parallels the 
log components (any many that are not) may be speci- s-plane notation. Thus, if T(z) is s time domain response 
fied as a ratio of two polynomials in the Laplace com- equivalent to T(s) m the frequency domain: 



plex frequency variable s. The general form is: 



™ _ _ £M- (5) 

JKS) " £out(j) ~ D(s) 



35 7U) = ^ 



(8) 



Where N(z) and D(z) have the form: 

A r U)=c 0 +c I r- I + c 2 2- 2 +. . . +0*- fl (9) 



Where T(s) is the transfer function in the s plane, 
Ein(s) and Eout(s) are the input and output signals re- ^ 

spectively as functions of s, and the numerator and txz)=d 0 +d\z- x +d2z- 2 -f. . . +dma- m (10) 



denominator functions N(s) and D(s) are of the form: 



In this notation the coefficients c and d suffice to 
specify the function as the a and b coefficients did in the 

xr/ * , , ii_«2it_3i ■ ji „, 45 s-plane, so equal compactness is possible. The z-plane 

Md-W.i+^+- • • + W 0) fl £ er may be H impl e m ented directly if the operator z is 

The attraction of this notation is that it may be very interpreted such that 

„ . - . i ♦ i ♦ II r is a delay of n sampling intervals, 

compact. To specify the function completely at all fre- specifying coefficients c and d are directly the 

quencies without need of interpolation we need only ^ m dtiplym£coe^^ 

specify the n + 1 coefficients a and the ji + 1 coefficients ^ iflcation to use only negative vmm of 

b. With these coefficients specified, the amplitude and since ^ GomspoadB to delays . A positive 

phase of the transfer function at any frequency may f of % WQu]d correspond t0 a negative de i ay , that 

readily be derived using well-known methods. A fur- is a response ^fore a stimulus was applied, 

ther attraction of this notation is that it is the form most 55 Wjth these notations m hand we may described 

readily derived from analysis of an analog circuit, and equipment to allow placement of images of broad and 

therefore, stands as the most natural, compact, and ^nds such as speech and music. For these purposes 

well-accepted method of specifying the transfer func- ^ e processor of the present invention, for exam- 

tion of such a circuit. pie, processor 802 of FIG. 8 ( may be embodied as a 

Yet another representation convenient for use in de- go variable two-path analog filter with variable path cou- 

scribing the present invention is the z-plane representa- pling attenuators as in Fig. 18A. 

tion. In the preferred embodiment of the present inven- i n FIG. 18A, a monophonic or monaural input signal 

tion, the signal processor will be implemented as digital 1601 is input to two filters 1610, 1630 and also to two 

filters in order to obtain the advantage of flexibility. potentiometers 1651, 1652. The outputs from filters 

Since each image position may be defined by a transfer 65 1610, 1630 are connected to potentiometers 1653, 1654. 

function, we need a form of filter in which the transfer The four potentiometers 1651-1654 are arranged as a 

function may be readily and rapidly realized with a so-called joystick control such that they act differen- 

minimum of restrictions as to which functions may be tially. One joystick axis allows control of potentiome- 
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14 



10 



15 



ters 1651, 1652; as one moves such as to pass a greater 
proportion of its input to its output, the other is mechan- 
ically reversed and passes a smaller proportion of its 
input to its output. Potentiometers 1653, 1654 are simi- 
larly differentially operated on a second, independent 
joystick axis. Output signals from potentiometers 1653, 
1654 are passed to unity gain buffers 1655, 1656 respec- 
tively, which in turn drive potentiometers 1657, 1658, 
respectively, that are coupled to act together; they in- 
crease or decrease the proportion of input passed to the 
output in step. The output signals from potentiometers 
1657, 1658 pass to a reversing switch 1659, which al- 
lows the filter signals to be fed directly or interchanged, 
to first inputs of summing elements 1660, 1670. 

Each respective summing element 1660, 1670 receives
at its second input an output from potentiometers
1651, 1652. Summing element 1670 drives inverter 1690,
and switch 1691 allows selection of the direct or
inverted signal to drive input 1684 of attenuator 1689. The
output of attenuator 1689 is the so-called right-channel
signal. Similarly, summing element 1660 drives inverter
1681, and switch 1682 allows selection of the direct or
inverted signal at point 1683. Switch 1685 allows
selection of the signal 1683 or the input signal 1601 as the
drive to attenuator 1686, which produces left channel
output 1688.

Filters 1610, 1630 are identical, and one is shown in
detail in FIG. 18B. A unity gain buffer 1611 receives the
input signal 1601 and is capacitively coupled via
capacitor 1612 to drive filter element 1613. Similar filter
elements 1614 to 1618 are cascaded, and final filter element
1618 is coupled via capacitor 1619 and unity gain buffer
1620 to drive inverter 1621. Switch 1622 allows
selection of either the output of buffer 1620 or of inverter
1621 at filter output 1623.
Filter elements 1613 through 1618 share the same circuit,
shown in detail in FIG. 18C, and differ only in the
value of their respective capacitor 1631. Input 1632 is
connected to capacitor 1631 and resistor 1633, and
resistor 1633 is coupled to the inverting input of operational
amplifier 1634, whose output 1636 is the filter element output.
Feedback resistor 1635 is connected to operational
amplifier 1634 in the conventional fashion. The non-inverting
input of operational amplifier 1634 is driven from
the junction of capacitor 1631 and one of resistors 1637
to 1642, as selected by switch 1643. This filter is an
all-pass filter with a phase shift that varies with
frequency according to the setting of switch 1643.

Table 1 lists the values of capacitor 1631 used in each 
filter element 1613-1618, and Table 2 lists the resistor 
values selected by switch 1642; these resistor values are 
the same for all filter elements 1613-1618. 
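
As an illustration only, the phase shift of one such filter element can be estimated with the short sketch below. It is not taken from the drawings: it assumes an ideal operational amplifier, assumes that resistors 1633 and 1635 are equal, and assumes that the selected resistor returns the non-inverting input to ground, which gives the standard first-order all-pass response H(s) = (sRC - 1)/(sRC + 1). The function names are hypothetical; the component values are those of Tables 1 and 2.

```python
import cmath
import math

# Component values from Table 1 (nF) and Table 2 (Ohms).
CAPACITORS_NF = [100, 47, 33, 15, 10, 4.7]
RESISTORS_OHM = [4700, 1000, 470, 390, 120]


def allpass_response(freq_hz, r_ohm, c_farad):
    """Complex response of an idealized first-order all-pass element.

    Assumption (not stated in the text): equal input and feedback
    resistors and an ideal op-amp, so H(s) = (sRC - 1) / (sRC + 1).
    """
    s_rc = 1j * 2 * math.pi * freq_hz * r_ohm * c_farad
    return (s_rc - 1) / (s_rc + 1)


def cascade_phase_deg(freq_hz, resistors_ohm):
    """Total phase in degrees of cascaded elements, one resistor per element.

    Each element contributes a phase between 0 and 180 degrees, so the
    per-element phases can be summed without any unwrapping step.
    """
    total = 0.0
    for c_nf, r_ohm in zip(CAPACITORS_NF, resistors_ohm):
        h = allpass_response(freq_hz, r_ohm, c_nf * 1e-9)
        total += math.degrees(cmath.phase(h))
    return total


if __name__ == "__main__":
    # All six elements set to the 120 Ohm resistor, evaluated at 1 kHz.
    print(cascade_phase_deg(1000.0, [120] * 6))
```

Under these assumptions the magnitude is unity at every frequency and each element passes through 90 degrees at f = 1/(2πRC), so the switch setting moves the frequency at which the phase transition occurs.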

One embodiment of summing elements 1660, 1670 is
shown in FIG. 18D, in which two inputs 1661, 1662 for
summing in operational amplifier 1663 result in a single
output 1664. The gains from input to output are
determined by the resistors 1665, 1667 and feedback resistor
1666. In both cases input 1662 is driven from switch
1659, and input 1661 from joystick potentiometers 1651,
1652 respectively.
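
A minimal sketch of how such a summing element sets its gains is given below. It assumes the conventional inverting summing-amplifier arrangement with an ideal operational amplifier; the text does not say which of resistors 1665, 1667 serves which input, so the parameter names here are hypothetical.

```python
def inverting_summer(v_in1, v_in2, r_in1_ohm, r_in2_ohm, r_feedback_ohm):
    """Output of an ideal inverting summing amplifier.

    Assumption: each input reaches the inverting node through its own
    resistor, so each gain is -(feedback resistor / input resistor).
    """
    gain1 = -r_feedback_ohm / r_in1_ohm
    gain2 = -r_feedback_ohm / r_in2_ohm
    return gain1 * v_in1 + gain2 * v_in2


if __name__ == "__main__":
    # Equal resistors give unity gain on both inputs, with inversion.
    print(inverting_summer(1.0, 2.0, 10e3, 10e3, 10e3))
```

With this arrangement the sign inversion of the summer can be undone, where required, by the inverter and selector switch that follow each summing element.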

As examples of image placement, Table 3 shows
settings and corresponding image positions to "fly" a
sound image corresponding to a helicopter at positions
well above the plane including the loudspeakers and the
listener. To obtain the required monophonic signal for
the process according to the present invention, the
stereo tracks on the sound effects disc were summed. With
the equipment shown set up as tabulated, realistic sound
images are projected in space in such a manner that the
listener perceives a helicopter at the locations tabulated.

TABLE 1

Filter #                    1      2      3      4      5      6
Capacitor 1631 value, nF    100    47     33     15     10     4.7


TABLE 2

Switch 1642 position #      1      2      3      4      5
Resistor #                  1637   1638   1639   1640   1641
Resistor value, Ohms        4700   1000   470    390    120


TABLE 3

Filter 1630 element 1 switch pos.      5         5
Filter 1630 element 2 switch pos.      5         5
Filter 1630 element 3 switch pos.      5         5
Filter 1630 element 4 switch pos.      5         5
Filter 1630 element 5 switch pos.      5         5
Filter 1630 inverting switch 1622      norm.     norm.
Potentiometer 1652 ratio               0.046     0.054
Potentiometer 1654 ratio               0.90      0.76
Potentiometer 1658 ratio               0.77      0.77
Inverting switch 1691 position         inv.      inv.
Selector switch 1685 position          1601      1601
Output attenuator 1686 ratio           0.23      0.23
Output attenuator 1687 ratio           1.0       1.0
Image azimuth a, degrees               -45       -30
Image altitude b, degrees              +21       +17
Image range r                          remote    remote


Note to Table 3: the setting of reversing switch 1659 in both cases is such that signals
from element 1657 drive element 1660, and those from element 1658 drive element
1670.

By addition of two extra elements to the above cir- 
cuits, an extra facility for lateral shifting of the listening 
area is provided. It should be understood, however, that 
this is not essential to the creation of images. The extra 
elements are shown in FIG. 19, in which left and right 
signals 1701, 1702 may be supplied from the outputs 
1688, 1689 respectively of the signal processor of FIG. 
16. In each channel a delay 1703, 1704 respectively is 
inserted, and the output signals from the delays 1703, 
1704 become the sound processor outputs 1705, 1706. 

The delays introduced into the channels by this addi- 
tional equipment are independent of frequency. They 
may thus each be completely characterized by a single 
real number. Let the left channel delay be t(l), and the 
right channel delay t(r). As in the above case, only the 
differential between the delays is significant, and we can 
completely control the equipment by specifying the 
difference between the delays. In implementation, we 
will add a fixed delay to each channel to ensure that at 
least no negative delay is required to achieve the re- 
quired differential. Defining a differential delay t(d) as: 



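
The fixed-delay scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the two channels are simply called A and B (which physical channel is which is left open), the differential is taken as the delay of B minus the delay of A, and delays are applied by prepending zero samples. All function names are hypothetical.

```python
def channel_delays(differential_s, fixed_s):
    """Split a signed differential delay into two non-negative delays.

    Returns (delay_a, delay_b) in seconds with delay_b - delay_a equal
    to differential_s. The fixed delay is added to both channels so
    that neither result is negative.
    """
    half = differential_s / 2.0
    if fixed_s < abs(half):
        raise ValueError("fixed delay too small for requested differential")
    return fixed_s - half, fixed_s + half


def apply_delay(samples, delay_s, sample_rate_hz):
    """Delay a sample sequence by prepending zeros (whole samples only)."""
    n = round(delay_s * sample_rate_hz)
    return [0.0] * n + list(samples)


if __name__ == "__main__":
    delay_a, delay_b = channel_delays(0.002, 0.005)
    print(delay_a, delay_b)
    print(apply_delay([1.0, 2.0], 0.001, 1000))
```

Because only the difference between the two delays matters, the size of the fixed delay is a free choice so long as it is at least half the largest differential to be used.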



If t(d) is zero, the effects produced will be essentially
unaffected by the additional equipment. If t(d) is positive,
the center of the listening area will be displaced
laterally to the right along dimension (e) of FIG. 3. A 
positive value of t(d) will correspond to a positive value 
of (e), signifying rightward displacement. Similarly, a 
leftward displacement, corresponding to a negative 
value of (e), may be obtained by a negative value of t(d). 
By this method the entire listening area, in which listeners perceive the illusion, may be projected laterally to any point between or beyond the loudspeakers. It is readily possible for dimension (e) to exceed half of dimension (s), and good results have been obtained out to extreme shifts at which dimension (e) is 83% of dimension (s). This may not be the limit of the technique, but represents the limit of current experimental [...]

SUMMARY OF THE INVENTION

Two ordinary, spaced-apart loudspeakers can produce a sound image that appears to the listener to be emanating from a location other than the actual location of the loudspeakers. The sound signals are processed according to this invention before they are reproduced so that no special playback equipment is required. Although two loudspeakers are required, the sound produced is not the same as conventional stereophonic, left and right, sound; however, stereo signals can be processed and improved according to this invention. The inventive sound processing involves dividing each monaural or single channel signal into two signals and then adjusting the differential phase and amplitude of the two channel signals on a frequency dependent basis in accordance with an empirically derived transfer function. The result of this processing is that the apparent sound source location can be placed as desired, provided that the transfer function is properly derived. Each transfer function has an empirically derived phase and amplitude adjustment that is built up for each predetermined frequency interval over the entire audio spectrum and [...] location. By providing a number of different transfer functions and selecting them accordingly the sound source can appear to the listener to move. The transfer function can be implemented by analog circuit components or the monaural signal can be digitized and digital filters and the like employed.

We claim:

1. A method for producing and locating an apparent origin of a selected sound from an electrical signal corresponding to the selected sound in a predetermined and localized position anywhere within the three-dimensional space containing a listener, comprising the steps of:
   separating said electrical signal into respective first and second channel signals;
   altering the amplitude and shifting the phase of the signal in both said first and second channel signals while maintaining said phase and amplitude differential therebetween for successive discrete frequency bands across the audio spectrum and each successive phase shift being different than the preceding phase shift, relative to zero degrees, thereby producing first channel [...] amplitude differential between the two channel signals;
   maintaining the first channel signal separate and apart from the second channel signal following the step of altering the amplitude and shifting the phase; and
   respectively applying said first and second channel modified signals that are maintained separate and apart and that have said phase and amplitude differential therebetween to first and second transducer means located within the three-dimensional space and spaced apart from the listener to produce a sound apparently originating at a predetermined location in the three-dimensional space that may be different from the location of said sound transducer means.

2. The method of claim 1 further including the step of applying said first and second channel signals to respective all pass filters, each said filter having a predetermined frequency response and topology as characterized by an empirically derived transfer function T(s) for the Laplace complex frequency variable (s).

3. The method of claim 2 wherein the step of applying at least one of said signals to at least one filter includes the further step of applying said at least one signal to a cascaded series of filters.

4. The method of claim 1 further including the step of storing said first and second channel signals and modified signals derived therefrom in a medium capable of regenerating said stored signals at a subsequent selected time.

5. The method of claim 1 wherein the step of altering the amplitude and shifting the phase includes respectively passing said first and second channel signals through first and second sound processors having respective predetermined transfer functions to effect said differential phase shift, whereby phase is shifted on a frequency dependent basis across the audio spectrum and in which each phase shift is different than the preceding phase shift, and a predetermined amplitude transfer function to effect said differential amplitude alteration.

6. The method of claim 5, wherein the predetermined phase and amplitude transfer functions are constructed on a [...]

7. A system for [...] space, an auditory sensory illusion of an apparent origin [...] one sound at a predetermined localized position located [...] the three-dimensional space containing a listener from a single electrical signal corresponding to the selected sound, comprising: first and second channel means both receiving the same single electrical signal, said first and second channel means including respective first and second sound processor means each for altering the amplitude and shifting the phase angle of the respective electrical signal on a frequency dependent basis for successive discrete frequency intervals across the audio spectrum to produce a respective modified signal wherein the amplitude alteration differential and the phase angle shift differential occurring between the two channels are respective predetermined values for each said successive frequency interval of the audio spectrum, said sound processor means shifting the phase angle such that each successive phase angle shift is different and independent of a preceding phase angle shift relative to zero degrees and said first and second channels being maintained separate and apart prior to being fed to two transducer means.

8. A system as in claim 7 further including storage means connected to said sound processor means for storing said modified signals in a medium capable of regenerating said stored signals at a subsequent selected time.

9. A system as in claim 7 wherein the sound processor means comprises a sound processor having a predetermined amplitude transfer function for producing the amplitude differential on a frequency dependent basis and having a predetermined phase transfer function for producing the phase angle differential on a frequency dependent basis.

10. A system as in claim 9, wherein the frequency dependent basis is made up of said intervals being 40 Hz wide.